
    High performance subgraph mining in molecular compounds

    Structured data represented in the form of graphs arises in several fields of science, and the growing amount of available data makes distributed graph mining techniques particularly relevant. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. The problem is characterized by a highly irregular search tree, for which no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiver-initiated load-balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute's HIV-screening dataset, where the approach attains close-to-linear speedup in a network of workstations.

    Mining residue contacts in proteins using local structure predictions


    A Knowledge Discovery Framework for Learning Task Models from User Interactions in Intelligent Tutoring Systems

    Domain experts should provide relevant domain knowledge to an Intelligent Tutoring System (ITS) so that it can guide a learner during problem-solving learning activities. However, for many ill-defined domains, the domain knowledge is hard to define explicitly. In previous work, we showed how sequential pattern mining can be used to extract a partial problem space from logged user interactions, and how it can support tutoring services during problem-solving exercises. This article describes an extension of this approach to extract a problem space that is richer and better suited to supporting tutoring services. We combined sequential pattern mining with (1) dimensional pattern mining, (2) time intervals, (3) the automatic clustering of valued actions, and (4) closed sequence mining. Some tutoring services have been implemented and an experiment has been conducted in a tutoring system.
    Comment: Proceedings of the 7th Mexican International Conference on Artificial Intelligence (MICAI 2008), Springer, pp. 765-77

    ClaSP: An Efficient Algorithm for Mining Frequent Closed Sequences


    Dynamic Load Balancing of Matrix-Vector Multiplications on Roadrunner Compute Nodes


    High-level information fusion for risk and accidents prevention in pervasive oil industry environments

    Proceedings of: 12th International Conference on Practical Applications of Agents and Multi-Agent Systems, University of Salamanca (Spain), 4th-6th June, 2014.
    Information fusion studies theories and methods to effectively combine data from multiple sensors and related information to achieve more specific inferences than could be achieved by using a single, independent sensor. Information fused from sensors and data mining analysis has recently attracted the attention of the research community for real-world applications. In this sense, the deployment of an Intelligent Offshore Oil Industry Environment will help to identify risky scenarios based on past events related to anomalies and on the profile of the current employee (role, location, etc.). In this paper we propose an information fusion model for an intelligent oil environment in which employees are alerted about possible risk situations while they are moving around their workplace. The layered architecture implements a reasoning engine capable of intelligently filtering the context profile of the employee (role, location) for the feature selection of an inter-transaction mining process. Depending on the employee's contextual information, he will receive intelligent alerts based on a prediction model that uses his role and his current location. This model provides the big picture of the risk analysis for that employee at that place and at that moment.
    This work was partially funded by CNPq BJT Project 407851/2012-

    Towards heuristic algorithmic memory

    We propose a long-term memory design for artificial general intelligence based on Solomonoff's incremental machine learning methods. We introduce four synergistic update algorithms that use a Stochastic Context-Free Grammar as a guiding probability distribution of programs. The update algorithms adjust production probabilities, re-use previous solutions, learn programming idioms, and discover frequent subprograms. A controlled experiment with a long training sequence shows that our incremental learning approach is effective. © 2011 Springer-Verlag Berlin Heidelberg

    An experiment with association rules and classification: post-bagging and conviction

    In this paper we study a new technique we call post-bagging, which consists in resampling parts of a classification model rather than the data. We do this with a particular kind of model, large sets of classification association rules, in combination with ordinary best-rule and weighted-voting approaches. We empirically evaluate the effects of the technique in terms of classification accuracy. We also discuss the predictive power of different metrics used for association rule mining, such as confidence, lift, conviction and χ². We conclude that, for the described experimental conditions, post-bagging improves classification results and that the best metric is conviction.
    Programa de Financiamento Plurianual de Unidades de I & D. Comunidade Europeia (CE). Fundo Europeu de Desenvolvimento Regional (FEDER). Fundação para a Ciência e a Tecnologia (FCT) - POSI/SRI/39630/2001/Class Project
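
Three of the rule-quality metrics named in the abstract above (confidence, lift and conviction) have standard textbook definitions in terms of itemset support. The following is a minimal sketch of those definitions only; it is not the paper's implementation and does not cover post-bagging. The function names `support` and `rule_metrics` and the toy transactions are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set


def support(transactions: List[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def rule_metrics(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Confidence, lift and conviction of the rule antecedent -> consequent.

    Standard definitions (hypothetical helper, not taken from the paper):
      confidence = supp(A and C) / supp(A)
      lift       = confidence / supp(C)
      conviction = (1 - supp(C)) / (1 - confidence)
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    supp_a = support(transactions, a)
    supp_c = support(transactions, c)
    if supp_a == 0 or supp_c == 0:
        raise ValueError("antecedent and consequent must both occur in the data")
    confidence = support(transactions, a | c) / supp_a
    lift = confidence / supp_c
    # A rule that never fails has unbounded conviction.
    conviction = float("inf") if confidence == 1.0 else (1 - supp_c) / (1 - confidence)
    return {"confidence": confidence, "lift": lift, "conviction": conviction}


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy = [{"a", "b"}, {"a", "b"}, {"a"}, {"b"}, {"c"}]
    print(rule_metrics(toy, {"a"}, {"b"}))
```

Conviction, unlike lift, is asymmetric in antecedent and consequent and grows without bound as a rule approaches certainty, which is one reason it can rank rules differently from confidence or lift.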